Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(hogql): placeholder expressions (part 1) #25091

Closed
wants to merge 13 commits into from

Conversation

mariusandra
Copy link
Collaborator

@mariusandra mariusandra commented Sep 19, 2024

Problem

Pulls out the first chunk from #25040

Changes

  • Replaces ast.Placeholder with ast.Block.
    • Before: ast.Placeholder(chain=["hogql_val_1"])
    • After: ast.Block(declarations=[ast.ExprStatement(expr=ast.Field(chain=["hogql_val_1"]))])
  • It's not prettier at the AST level, but blocks let us capture complex expressions in placeholders
    • For now we prohibit all blocks with more than 1 statement
    • We also prohibit blocks where the statement is not a simple field access
    • This means there should be zero functional changes
  • The ability to run bytecode in these blocks will be a followup PR.

How did you test this code?

Updated a lot of tests. MyPy was my friend.

@posthog-bot
Copy link
Contributor

It looks like the code of hogql-parser has changed since last push, but its version stayed the same at 1.0.40. 👀
Make sure to resolve this in hogql_parser/setup.py before merging!

@mariusandra mariusandra marked this pull request as ready for review September 21, 2024 08:22
@mariusandra mariusandra marked this pull request as draft September 22, 2024 22:50
Copy link
Member

@Gilbert09 Gilbert09 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall this makes sense, natural evolution of ast.placeholder - the biggest nit here would be to remove all references of placeholders if we're totally moving to Blocks

type: Optional[Type] = None

@cached_property
def placeholder_chain(self) -> str | None:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd return a list[str | int] like a normal ast.Field(...).chain - feels like this would be more aligned

def visit_block(self, node: ast.Block):
# TODO: remove all this when we can use bytecode in placeholders
if len(node.declarations) > 1:
raise QueryError("Placeholders can only contain a single declaration")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: "Blocks can only contain a single declaration"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a fan of the language mix of "block" and "placeholder" - especially with removing ast.Placeholder node. I'd be in favour of removing all placeholder references in favour of "block" here

@@ -40,6 +41,21 @@ class VariableDeclaration(Declaration):
expr: Optional[Expr] = None


@dataclass(kw_only=True)
class Block(Expr):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having a util that creates a Block from similar interface as a ast.Placeholder could be nice:

def block_from_field_chain(chain: list[str | int]) -> ast.Block:
	return ast.Block(....)

@mariusandra
Copy link
Collaborator Author

The biggest problem here is that it makes the parser a lot slower. Not substantially in my testing, but enough to cause a difference in CI --> one test that took 22min now times out at 30 😬. So work to do and alternatives to consider

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants